fix: handle non-UTF-8 bytes in redirect Location header #12441
NIK-TIGER-BILL wants to merge 1 commit into aio-libs:master
When a server sends a Location header containing non-UTF-8 bytes (e.g. latin-1 encoded characters like ø), aiohttp's multidict decodes them as surrogates, which then causes yarl.URL() to create a malformed URL. This fix detects surrogate characters in the decoded Location/URI header and falls back to latin-1 decoding (per RFC 7230) using the raw header bytes before parsing the redirect URL.
Regression test added for: aio-libs#10047
Signed-off-by: NIK-TIGER-BILL <nik.tiger.bill@github.com>
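The detect-and-fall-back step described above can be sketched in plain Python. This is a minimal illustration, not the code from this PR: the helper names `has_surrogates` and `decode_location` are hypothetical, and it assumes the header value was first decoded as UTF-8 with the `surrogateescape` error handler, which is how lone surrogates would end up in the decoded string.

```python
def has_surrogates(value: str) -> bool:
    """Return True if the string contains any surrogate code point."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in value)


def decode_location(raw: bytes) -> str:
    """Decode a raw Location header value, falling back to latin-1.

    Hypothetical helper for illustration only.
    """
    # Assumption: the first decode uses UTF-8 with "surrogateescape",
    # which maps each undecodable byte to a lone surrogate
    # in the range U+DC80..U+DCFF instead of raising an error.
    decoded = raw.decode("utf-8", "surrogateescape")
    if has_surrogates(decoded):
        # latin-1 maps every byte 0x00..0xFF to exactly one code point,
        # so this decode cannot fail.
        return raw.decode("latin-1")
    return decoded


# A latin-1 encoded "ø" (byte 0xF8) is not valid UTF-8 on its own.
print(ascii(decode_location(b"/caf\xf8")))                    # '/caf\xf8'
# The same character encoded as valid UTF-8 takes the normal path.
print(ascii(decode_location("/caf\u00f8".encode("utf-8"))))   # '/caf\xf8'
```

Both inputs end up as the same string, `/cafø`. Byte sequences that happen to be valid UTF-8 never reach the latin-1 branch, so a fallback of this kind only changes behavior for values that would otherwise contain surrogates.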
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #12441 +/- ##
==========================================
- Coverage 99.10% 98.91% -0.20%
==========================================
Files 130 134 +4
Lines 45446 46786 +1340
Branches 2398 2434 +36
==========================================
+ Hits 45040 46279 +1239
- Misses 275 376 +101
Partials 131 131
Merging this PR will not alter performance
This looks the same as the existing open PR, which also does not do what I suggested in the issue.
What do these changes do?
Fixes #10047
When a server sends a Location header containing non-UTF-8 bytes (e.g. latin-1 encoded characters like ø), aiohttp's multidict decodes them as surrogates. This causes yarl.URL() to create a malformed URL, leading to a 404 or other errors on redirect.
This PR detects surrogate characters in the decoded Location/URI header and falls back to latin-1 decoding (per RFC 7230) using the raw header bytes before parsing the redirect URL.
Are there changes in behavior for the user?
Yes. Redirects with non-UTF-8 bytes in the Location header will now be followed correctly instead of producing a malformed URL.
Related issue number
Fixes #10047
Checklist
CONTRIBUTORS.txt
CHANGES folder